fine-tuned gpt-2
Promptor: A Conversational and Autonomous Prompt Generation Agent for Intelligent Text Entry Techniques
Shen, Junxiao, Dudley, John J., Zheng, Jingyao, Byrne, Bill, Kristensson, Per Ola
Text entry is an essential task in our day-to-day digital interactions. Numerous intelligent features have been developed to streamline this process, making text entry more effective, efficient, and fluid. These improvements include sentence prediction and user personalization. However, as deep learning-based language models become the norm for these advanced features, the necessity for data collection and model fine-tuning increases. These challenges can be mitigated by harnessing the in-context learning capability of large language models such as GPT-3.5. This capability allows the language model to acquire new skills through prompts, eliminating the need for data collection and fine-tuning. Consequently, large language models can learn various text prediction techniques. We initially showed that, for a sentence prediction task, merely prompting GPT-3.5 surpassed a GPT-2 backed system and was comparable with a fine-tuned GPT-3.5 model, with the latter two methods requiring costly data collection, fine-tuning and post-processing. However, prompting large language models to specialize in specific text prediction tasks can be challenging, particularly for designers without expertise in prompt engineering. To address this, we introduce Promptor, a conversational prompt generation agent designed to engage proactively with designers. Promptor can automatically generate complex prompts tailored to meet specific needs. We conducted a user study in which 24 participants created prompts for three intelligent text entry tasks; half of the participants used Promptor while the other half designed prompts themselves. The results show that Promptor-designed prompts yield a 35% increase in similarity and a 22% increase in coherence over those written by designers.
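As a rough illustration of the in-context learning idea in the abstract above, the sketch below assembles a few-shot prompt for sentence prediction. It is not Promptor's prompt or the authors' method: the function name `build_sentence_prediction_prompt`, the instruction wording, and the example pairs are all hypothetical, and the resulting string would still need to be sent to a language model of your choice.

```python
def build_sentence_prediction_prompt(examples, partial_input):
    """Assemble a few-shot prompt from (typed_text, full_sentence) pairs.

    Hypothetical helper for illustration only; the instruction text and
    the Input/Output layout are assumptions, not taken from the paper.
    """
    lines = [
        "Complete the user's sentence. Reply with the full sentence only.",
        "",
    ]
    # Each example shows the model one typed prefix and its completion.
    for typed, full in examples:
        lines.append(f"Input: {typed}")
        lines.append(f"Output: {full}")
        lines.append("")
    # The final, unanswered pair is what the model is asked to complete.
    lines.append(f"Input: {partial_input}")
    lines.append("Output:")
    return "\n".join(lines)


# Made-up example pairs, purely illustrative.
examples = [
    ("see you at", "See you at the meeting tomorrow."),
    ("thanks for", "Thanks for sending the report."),
]
prompt = build_sentence_prediction_prompt(examples, "can we move")
print(prompt)
```

The point of the sketch is that the task is specified entirely inside the prompt, so no training data collection or fine-tuning step appears anywhere in the code.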
I Fine-Tuned GPT-2 on 100K Scientific Papers
After fine-tuning the model, I wanted to understand what it had learned and how training on paper abstracts influenced the generated text. First, I generated a sample text using "the role of recommender systems" as a prompt. The result sounded as if it had been copied and pasted from one of the existing abstracts, but after checking it with several anti-plagiarism tools, I found that it was 100% unique. During training, the model captured common features of the abstracts and learned how to replicate them while still generating fresh text. Interestingly, the model used scientific language and common expressions: "The previous works…", "In this paper…", "We propose…", "The experimental result…".
r/MachineLearning - [P] DialoGPT: State of the Art Conversational Model with Fine-Tuned GPT-2 (Microsoft Research)
I've managed to get the model running generation on my PC. One thing worth pointing out is that the checkpoint can NOT be loaded exactly like the GPT-2 model checkpoint from the Hugging Face pytorch-transformers repository. You'll also need to manually define the config. Generation works just fine with nucleus sampling, and once in a while an end-of-text (E-O-T) token is emitted to indicate the end of one post.
Bot: they're having an open gym soon in June... dont think they'll be up there this time though
Bot: So what's this gym called?
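For readers unfamiliar with the nucleus (top-p) sampling mentioned in the post, here is a minimal pure-Python sketch of the idea. It is not the poster's code or the DialoGPT implementation: `top_p_filter`, `sample_until_eot`, and the `next_token_probs` callback are hypothetical names, and a real setup would get the probabilities from the model's softmax output rather than a toy function.

```python
import random


def top_p_filter(probs, top_p=0.9):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize. Returns {token_id: probability}."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token_id, p in ranked:
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token_id: p / total for token_id, p in kept}


def sample_until_eot(next_token_probs, eot_id, top_p=0.9, max_tokens=50, rng=None):
    """Sample tokens from the nucleus until the end-of-text id appears
    or max_tokens is reached. next_token_probs(tokens) is an assumed
    callback returning a probability list over the vocabulary."""
    rng = rng or random.Random()
    tokens = []
    for _ in range(max_tokens):
        nucleus = top_p_filter(next_token_probs(tokens), top_p)
        ids = list(nucleus)
        weights = [nucleus[i] for i in ids]
        token = rng.choices(ids, weights=weights, k=1)[0]
        if token == eot_id:
            break  # end of one post
        tokens.append(token)
    return tokens


# Toy distribution over a 4-token vocabulary, purely illustrative.
print(top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.9))
```

Cutting off the low-probability tail is what keeps the samples varied but coherent, and stopping at the end-of-text id mirrors the behaviour described above, where the token marks the end of one post.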